SQL function
Back to DuckDB Data Engineering Glossary
Overview
A SQL function is a reusable piece of code that performs a specific operation and returns a result. Functions in SQL act like mathematical functions - they take inputs (called arguments or parameters), process them according to defined rules, and produce an output. SQL functions help make queries more readable and maintainable by encapsulating common operations.
Types of SQL Functions
DuckDB supports several categories of functions:
Scalar functions operate on a single row and return one value per row. For example, UPPER('hello')
converts a string to uppercase, returning 'HELLO'.
Aggregate functions operate on multiple rows and return a single summary value. For example, AVG(salary)
would compute the average of all salary values in the selected rows.
Window functions perform calculations across a specified set of table rows related to the current row. For example, ROW_NUMBER() OVER (ORDER BY salary DESC)
assigns sequential numbers to rows based on the salary ordering.
Table functions return a result set that can be used like a table in a query. For example, generate_series(1, 5)
creates a table with values from 1 to 5.
DuckDB Examples
Basic scalar function usage:
Copy code
SELECT ROUND(3.14159, 2) AS rounded_pi;
SELECT CONCAT(first_name, ' ', last_name) AS full_name FROM employees;
Aggregate function with grouping:
Copy code
SELECT department_id,
COUNT(*) as emp_count,
AVG(salary) as avg_salary
FROM employees
GROUP BY department_id;
Window function example:
Copy code
SELECT employee_name,
salary,
RANK() OVER (ORDER BY salary DESC) as salary_rank
FROM employees;
Table function usage:
Copy code
SELECT * FROM generate_series(1, 5) AS t(num);
SELECT * FROM read_csv_auto('data.csv');
Custom Functions
DuckDB allows you to create custom functions using Python through its Python API. This enables you to extend DuckDB's functionality with your own specialized operations. Custom functions can be particularly useful when you need to implement business-specific logic or complex calculations not available in standard SQL.
Best Practices
When using SQL functions, consider performance implications - some functions may require full table scans or complex computations. Use appropriate indexes when functions are used in WHERE clauses. Choose the right function type for your needs - scalar functions for row-by-row operations, aggregate functions for summaries, and window functions for relative calculations.